Yuandong Tian

GaLore 2: Large-Scale LLM Pre-Training by Gradient Low-Rank Projection

Apr 29, 2025
Via arXiv

R-Sparse: Rank-Aware Activation Sparsity for Efficient LLM Inference

Apr 28, 2025
Via arXiv

Param$Δ$ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Apr 23, 2025
Via arXiv

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Mar 19, 2025
Via arXiv

NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions

Feb 18, 2025
Via arXiv

LLM Pretraining with Continuous Concepts

Feb 12, 2025
Via arXiv

Spectral Journey: How Transformers Predict the Shortest Path

Feb 12, 2025
Via arXiv

SHARP: Accelerating Language Model Inference by SHaring Adjacent layers with Recovery Parameters

Feb 11, 2025
Via arXiv

GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?

Feb 07, 2025
Via arXiv

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

Feb 05, 2025
Via arXiv